Vectorized Operations

  • element-by-element operations do not require explicit loops
  • pandas Series objects can be passed to most NumPy functions

documentation: http://pandas.pydata.org/pandas-docs/stable/basics.html
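
As a rough illustration of the first point, the sketch below computes the same result twice: once by visiting each element explicitly, and once with a single vectorized expression. The names `values`, `looped`, and `vectorized` are made up for this example.

```python
import pandas as pd

values = pd.Series({'a': 1.0, 'b': 2.0, 'c': 3.0})

# Explicit element-by-element version: build the result one value at a time.
looped = pd.Series({label: value * 2 for label, value in values.items()})

# Vectorized version: the arithmetic is applied to the whole Series at once.
vectorized = values * 2

print(looped.equals(vectorized))
```

Both approaches give the same Series; the vectorized form is shorter and usually faster on large data.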


In [7]:
import pandas as pd
import numpy as np

In [8]:
my_dictionary = {'a': 45.0, 'b': -19.5, 'c': 4444}
my_series = pd.Series(my_dictionary)
my_series


Out[8]:
a      45.0
b     -19.5
c    4444.0
dtype: float64

Add two Series without a loop

In [9]:
my_series + my_series


Out[9]:
a      90.0
b     -39.0
c    8888.0
dtype: float64

In [10]:
my_series


Out[10]:
a      45.0
b     -19.5
c    4444.0
dtype: float64

Series within an arithmetic expression

In [11]:
my_series + 5


Out[11]:
a      50.0
b     -14.5
c    4449.0
dtype: float64

Series used as an argument to a NumPy function

In [12]:
np.exp(my_series)


Out[12]:
a    3.493427e+19
b    3.398268e-09
c             inf
dtype: float64
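
The `inf` for label `c` in Out[12] is floating-point overflow: exp(4444) is larger than the largest value float64 can represent. The short sketch below, which uses a hypothetical `small_series`, shows that an element-wise NumPy function returns a Series that keeps the original labels.

```python
import numpy as np
import pandas as pd

small_series = pd.Series({'a': 0.0, 'b': 1.0, 'c': 4.0})

# np.sqrt is applied element-wise; the result is still a Series
# carrying the same labels as the input.
roots = np.sqrt(small_series)

print(type(roots).__name__)
print(list(roots.index))
print(roots['c'])
```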

A key difference between Series and ndarray is that operations between Series automatically align the data by label. You can therefore write computations without checking whether the Series involved have the same labels. A label that appears in only one of the operands produces NaN in the result.


In [13]:
my_series[1:]


Out[13]:
b     -19.5
c    4444.0
dtype: float64

In [14]:
my_series[:-1]


Out[14]:
a    45.0
b   -19.5
dtype: float64

In [15]:
my_series[1:] + my_series[:-1]


Out[15]:
a     NaN
b   -39.0
c     NaN
dtype: float64
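
If NaN for unmatched labels is not what you want, one option is the `Series.add` method with its `fill_value` parameter, which substitutes a value for a label missing on one side. A minimal sketch, where `full`, `left`, and `right` are names chosen for this example:

```python
import pandas as pd

full = pd.Series({'a': 45.0, 'b': -19.5, 'c': 4444.0})
left = full.iloc[1:]    # labels b, c
right = full.iloc[:-1]  # labels a, b

# The + operator leaves NaN where a label is missing on either side.
plain_sum = left + right

# Series.add with fill_value=0 treats a missing label as 0 instead.
filled_sum = left.add(right, fill_value=0)

print(filled_sum)
```

Whether filling with 0 is appropriate depends on what a missing label means in your data.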

Apply Python functions on an element-by-element basis


In [16]:
def multiply_by_ten(input_element):
    return input_element * 10.0

In [17]:
my_series.map(multiply_by_ten)


Out[17]:
a      450.0
b     -195.0
c    44440.0
dtype: float64
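
For simple arithmetic like this, `map` is not strictly needed: the same result can be written as a vectorized expression. The sketch below rebuilds the Series from In [8] and compares the two approaches; `mapped` and `vectorized` are names introduced here for illustration.

```python
import pandas as pd

my_series = pd.Series({'a': 45.0, 'b': -19.5, 'c': 4444.0})

def multiply_by_ten(input_element):
    return input_element * 10.0

# map calls the Python function once per element.
mapped = my_series.map(multiply_by_ten)

# The same computation as a single vectorized operation.
vectorized = my_series * 10.0

print(mapped.equals(vectorized))
```

`map` remains useful when the per-element logic has no vectorized equivalent.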

Vectorized string methods

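Series is equipped with a set of string processing methods, available through the .str accessor, that make it easy to operate on each element of the array. Perhaps most importantly, these methods skip missing/NA values automatically: a missing entry stays missing in the result instead of raising an error.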


In [18]:
series_of_strings = pd.Series(['A', 'B', 'C', 'Aaba', 'Baca', np.nan, 'CABA', 'dog', 'cat'])

In [19]:
series_of_strings.str.lower()


Out[19]:
0       a
1       b
2       c
3    aaba
4    baca
5     NaN
6    caba
7     dog
8     cat
dtype: object
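
Other `.str` methods follow the same pattern. The sketch below uses a hypothetical `words` Series to show `str.len` and `str.contains`; the `na=False` argument tells `str.contains` to report False for the missing entry.

```python
import numpy as np
import pandas as pd

words = pd.Series(['Aaba', np.nan, 'dog'])

# Length of each string; the missing entry stays missing in the result.
lengths = words.str.len()

# Substring test; na=False reports False for the missing entry.
has_a = words.str.contains('a', na=False)

print(lengths)
print(has_a)
```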
